| | After | Before | Difference |
|---|---|---|---|
| Treated group | 6.57 | 7.12 | -0.55 |
| Control group | 7.42 | 7.22 | 0.20 |
| Difference | -0.85 | -0.10 | -0.75 |
Data Analytics for Finance
We did not answer this question in the previous lecture!
DiLLMA example: controlling for study hours and attendance rate
What is Difference-in-Differences (DiD)?
Difference-in-Differences (DiD) is a quasi-experimental method that exploits within-group variation over time and cross-group variation to identify a causal effect when random assignment is infeasible.
Compare outcomes before and after treatment implementation, e.g. pre- and post-policy change
All variation in Treatment is explained by Time!
Compare outcomes between treated and control groups, e.g. those affected by a policy change vs those not affected
Differences between Treated and Control groups may be driven by time-invariant confounders, e.g. ability, demographics, location, etc.
Combining both comparisons allows us to isolate the causal impact of the treatment, a.k.a. the average treatment effect on the treated (ATT)
DiD only recovers the causal effect if the “parallel trends assumption” holds!
Compare grade changes in allowed vs. banned courses, before and after LLMs became available
DiD isolates the treated group’s response, conditional on the assumption that the untreated group’s changes represent the non-treatment counterfactual for the treated group
| | (1) After | (2) Before | (1) - (2) |
|---|---|---|---|
| (a) Treatment | \(Y_{treated,\ after}\) | \(Y_{treated,\ before}\) | \(\Delta_{treated}\) |
| (b) Control | \(Y_{control,\ after}\) | \(Y_{control,\ before}\) | \(\Delta_{control}\) |
| (a) - (b) | \(\Delta_{after}\) | \(\Delta_{before}\) | DiD |
\[Y = \beta_0 + \beta_1 Treated + \beta_2 After + \beta_3 Treated \times After + \epsilon\]
The difference-in-differences regression gives you the same estimate as if you took differences in the group averages
It also takes care of any unobserved constant differences between subjects and of common time trends!
| | (1) After | (2) Before | (1) - (2) |
|---|---|---|---|
| (a) Treatment | \(\beta_0 + \beta_1+\beta_2+\beta_3\) | \(\beta_0 + \beta_1\) | \(\beta_2+\beta_3\) |
| (b) Control | \(\beta_0 + \beta_2\) | \(\beta_0\) | \(\beta_2\) |
| (a) - (b) | \(\beta_1+\beta_3\) | \(\beta_1\) | \(\beta_3\) |
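The mapping between regression coefficients and group averages can be checked numerically. The following is a minimal sketch on simulated data: the sample size, coefficient values, and noise level are made up for illustration and are not taken from the DiLLMa data. It fits the interaction regression with plain least squares and compares \(\beta_3\) with the difference of differences in the four cell means.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical design: 100 observations in each of the four cells.
treated = np.repeat([0, 0, 1, 1], 100)
after = np.repeat([0, 1, 0, 1], 100)
n = treated.size

# Made-up coefficients, chosen only for illustration.
y = (7.0 + 0.1 * treated + 0.2 * after
     - 0.75 * treated * after
     + rng.normal(scale=1.0, size=n))

# Design matrix: intercept, Treated, After, Treated x After.
X = np.column_stack([np.ones(n), treated, after, treated * after])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)


def cell_mean(t, a):
    """Average outcome in the cell with Treated == t and After == a."""
    return y[(treated == t) & (after == a)].mean()


# Difference-in-differences computed directly from the group averages.
did_means = ((cell_mean(1, 1) - cell_mean(1, 0))
             - (cell_mean(0, 1) - cell_mean(0, 0)))

print(f"beta_3 from regression: {beta[3]:.4f}")
print(f"DiD from cell means:    {did_means:.4f}")
```

Because the regression has one free parameter per cell, the two numbers agree up to floating-point error, whatever the noise draw.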
treated: Indicator for whether the course allows LLM use (1 = yes, 0 = no)
after: Indicator for whether the observation is from the post-LLM period (1 = after, 0 = before)
exam_score: The student’s exam score

| | After | Before | Difference |
|---|---|---|---|
| Treated group | 6.57 | 7.12 | -0.55 |
| Control group | 7.42 | 7.22 | 0.20 |
| Difference | -0.85 | -0.10 | -0.75 |
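The arithmetic behind the table can be reproduced in a few lines. This sketch uses only the four group averages reported above; the variable names are illustrative.

```python
# Group averages of exam_score, copied from the table above.
treated_after, treated_before = 6.57, 7.12
control_after, control_before = 7.42, 7.22

# Row-wise: change over time within each group.
delta_treated = treated_after - treated_before
delta_control = control_after - control_before

# Column-wise: gap between groups within each period.
delta_after = treated_after - control_after
delta_before = treated_before - control_before

# Both routes lead to the same difference-in-differences.
did_rows = delta_treated - delta_control
did_cols = delta_after - delta_before

print(f"Change, treated: {delta_treated:.2f}")
print(f"Change, control: {delta_control:.2f}")
print(f"DiD (rows):      {did_rows:.2f}")
print(f"DiD (columns):   {did_cols:.2f}")
```

Differencing across rows first or across columns first gives the same bottom-right cell, which is why the table can be read in either direction.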
\[Y = \alpha_i + \alpha_t + \beta_3 Treated \times After + \epsilon\]
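One way to see what the fixed-effects specification does is to estimate it by two-way demeaning. The sketch below assumes a balanced two-period panel with simulated data; the panel size, the effect of -0.75, and the noise level are hypothetical and not taken from the lecture's dataset. Subtracting unit and period averages removes \(\alpha_i\) and \(\alpha_t\), so \(\beta_3\) can be estimated from the demeaned variables alone.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical balanced panel: 50 units observed before and after.
n_units = 50
unit_effect = rng.normal(size=n_units)               # alpha_i
time_effect = np.array([0.0, 0.3])                   # alpha_t
treated = (np.arange(n_units) < 25).astype(float)    # first 25 units treated
after = np.array([0.0, 1.0])
true_effect = -0.75                                  # made-up beta_3

# D is the Treated x After interaction, shape (units, periods).
D = treated[:, None] * after[None, :]
noise = rng.normal(scale=0.1, size=D.shape)
Y = unit_effect[:, None] + time_effect[None, :] + true_effect * D + noise


def demean_two_way(M):
    """Remove unit and period averages (valid for a balanced panel)."""
    return (M - M.mean(axis=1, keepdims=True)
              - M.mean(axis=0, keepdims=True)
              + M.mean())


Y_dd, D_dd = demean_two_way(Y), demean_two_way(D)
beta3 = (D_dd * Y_dd).sum() / (D_dd ** 2).sum()

# Same quantity from the 2x2 table of group averages.
is_treated = treated == 1
did_means = ((Y[is_treated, 1].mean() - Y[is_treated, 0].mean())
             - (Y[~is_treated, 1].mean() - Y[~is_treated, 0].mean()))

print(f"beta_3 (two-way demeaning): {beta3:.4f}")
print(f"DiD from group averages:    {did_means:.4f}")
```

With two groups and two periods the fixed-effects estimate coincides with the simple table-based DiD; the demeaning shortcut used here relies on the panel being balanced.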
The parallel trends assumption states that, in the absence of treatment, the average change in the outcome variable would have been the same for both the treatment and control groups.
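Written out, the assumption can be sketched as follows, where \(Y^{0}\) denotes the outcome that would be observed without treatment (potential-outcome notation introduced here for illustration; it does not appear elsewhere in these slides):

\[E[Y^{0}_{after} - Y^{0}_{before} \mid Treated = 1] = E[Y^{0}_{after} - Y^{0}_{before} \mid Treated = 0]\]

The left-hand side is never observed, because the treated group is in fact treated after the change. The assumption lets the control group's observed change \(\Delta_{control}\) stand in for it, so that \(\Delta_{treated} - \Delta_{control}\) can be read as the ATT.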
Limitations
Limitations arise in staggered (rollout) designs, where treatment timing varies across groups; the two-way fixed effects (TWFE) estimator can perform poorly in such settings (to be discussed later).
A way to extend your MSc replication project is to apply some of the new DiD methods to the DiLLMa dataset and compare the results with the traditional TWFE approach.
Thank You for Your Attention!
See You in the Next One!